We represent the number of $m \times n$ non-negative integer matrices (contingency tables)
with prescribed row sums and column sums as the expected value of the permanent of a
non-negative random matrix with exponentially distributed entries. We bound the variance
of the obtained estimator, from which it follows that if the row and column sums are
bounded by a constant fixed in advance, we get a polynomial time approximation scheme for
counting contingency tables. We show that the complete symmetric polynomial of a fixed
degree in $n$ variables can be $\epsilon$-approximated coefficient-wise by a sum of powers
of $O(\log n)$ linear forms, from which it follows that if the row sums (but not
necessarily column sums) are bounded by a constant, there is a deterministic approximation
algorithm of $m^{O(\log n)}$ complexity to compute the logarithmic asymptotic of the
number of tables.


1. Introduction and main results 

(1.1) Contingency tables. Contingency tables are non-negative integer matrices 
with prescribed row and column sums, called marginals. The problem of computing
the number of contingency tables with given marginals has attracted a lot of
attention recently, see [DG95], [D+97], [Mo02], [CD03]. The counting problem is 
motivated by applications to statistics, combinatorics, representation theory, and 
is interesting in its own right, cf. [DG95]. 

Let us consider non-negative integer $m \times n$ matrices with the row sums
$r_1, \ldots, r_m$ and the column sums $c_1, \ldots, c_n$ such that
$r_1 + \ldots + r_m = c_1 + \ldots + c_n = N$. If the number $m$ of rows and the number
$n$ of columns are fixed in advance, the number of such matrices can be computed in
polynomial time (that is, in time polynomial in $\log N$) since the problem reduces to
counting integer points in a polytope in fixed dimension, see [Ba94]. In fact, one does
not need to apply the counting algorithm in full generality since the polytope in
question, the transportation polytope of non-negative matrices with prescribed row and
column sums, is either totally unimodular, or a straightforward “combinatorial”
degeneration of a totally unimodular polytope, a fact used in [DS03].

If one of the dimensions (for example, the number of columns) is allowed to 
grow, the exact counting becomes difficult. As is shown in [D+97], exact counting 
is #P-hard already for $2 \times n$ matrices. The hardness result uses that the total sum
N can become exponentially large in n (so that log N is polynomial in n). If the 
number m of rows is fixed, a dynamic programming based algorithm computes the 
number of tables in time polynomial in N, thus resulting in a pseudo-polynomial 
algorithm, cf. [CD03]. 

On the other hand, Dyer, Kannan, and Mount [D+97] have shown that if all
the marginals $r_i, c_j$ are not too small ($r_i = \Omega(n^2 m)$ and
$c_j = \Omega(m^2 n)$), then the Monte Carlo based approach allows one to approximate the
number of contingency tables within a prescribed relative error $\epsilon > 0$ in time
polynomial in $m$, $n$ and $\epsilon^{-1}$. In this case, the number of tables is well
approximated by the volume of the corresponding polytope. Subsequently, Morris improved
the bounds to $r_i = \Omega(n^{3/2} m \log m)$ and $c_j = \Omega(m^{3/2} n \log n)$.
Combining the dynamic programming approach with the volume approximation idea, Cryan and
Dyer [CD03] obtained a randomized polynomial time approximation algorithm in the
situation when the number of rows is fixed. This was later generalized in [C+04].

Thus the most difficult case is that with $N$ “moderately large” with respect to
$m$ and $n$.

If both row sums $r_i$ and column sums $c_j$ are small, A. Békéssy, P. Békéssy, and
Komlós [B+72] proved the asymptotic formula



for the number of tables assuming that $N \to +\infty$ while the marginals remain
bounded by a constant, fixed in advance: $r_i, c_j \le \rho$. In [B+72], the authors
proved that the relative error of this approximation is $O(N^{-1/2} \log N)$ and
conjectured that it is $O(N^{-1})$. Essentially, formula (1.1.1) counts contingency
tables with entries not exceeding 2.

Good and Crook [GC77] make a heuristic argument that the formula should be
valid for contingency tables under more general conditions of $r_i c_j / N$ being small.

If $m = n$ and $r_i = c_j = 2$, an explicit generating function for the number of
tables is known, see Corollary 5.5.11 of [St99], which leads to a pseudo-polynomial 
algorithm to compute the number of such tables exactly. 

Suppose now that we count every table $(d_{ij})$ with weight

$$\prod_{i,j} \frac{1}{d_{ij}!} \tag{1.1.2}$$

(the Fisher-Yates or the multiple hypergeometric statistics). In this case, the
weighted number of tables with row sums $r_1, \ldots, r_m$ and column sums
$c_1, \ldots, c_n$ is exactly equal to

$$\frac{N!}{r_1! \cdots r_m! \, c_1! \cdots c_n!}. \tag{1.1.3}$$

(1.2) Symmetric polynomials. For a positive integer $r$, the complete symmetric
polynomial $h_r$ of degree $r$ in $n$ variables $x_1, \ldots, x_n$ is the sum of all
distinct monomials

$$x^a = x_1^{\alpha_1} \cdots x_n^{\alpha_n} \quad \text{where} \quad \sum_{i=1}^{n} \alpha_i = r \quad \text{and} \quad \alpha_i \ge 0 \quad \text{for} \quad i = 1, \ldots, n.$$


A well-known and easy to prove result states that the number of $m \times n$
contingency tables with row sums $r_1, \ldots, r_m$ and column sums $c_1, \ldots, c_n$
is equal to the coefficient of the monomial $x_1^{c_1} \cdots x_n^{c_n}$ in the product

$$h_{r_1}(x) \cdots h_{r_m}(x), \tag{1.2.1}$$

see, for example, Proposition 7.5.1 of [St99]. Similarly, if $e_r$ is the elementary
symmetric polynomial of degree $r$ in $x_1, \ldots, x_n$ (that is, the sum of all
square-free monomials of degree $r$), then the coefficient of the monomial
$x_1^{c_1} \cdots x_n^{c_n}$ in the product

$$e_{r_1}(x) \cdots e_{r_m}(x)$$

is the number of 0-1 matrices with the row sums $r_1, \ldots, r_m$ and the column sums
$c_1, \ldots, c_n$, see Proposition 7.4.1 of [St99].
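The coefficient description above can be checked on small examples. The following Python sketch is an illustration only and is not part of the original argument; the function names `count_by_coefficient` and `count_by_enumeration` are hypothetical. It stores a polynomial as a dictionary from exponent vectors to coefficients, multiplies the polynomials $h_{r_i}$, and compares the coefficient of $x_1^{c_1} \cdots x_n^{c_n}$ with a direct enumeration of tables. Both routines take exponential time and are meant for tiny marginals.

```python
from collections import defaultdict
from itertools import product


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multiply(f, g):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for a, fa in f.items():
        for b, gb in g.items():
            out[tuple(x + y for x, y in zip(a, b))] += fa * gb
    return dict(out)


def count_by_coefficient(rows, cols):
    """Coefficient of x^c in the product h_{r_1}(x) ... h_{r_m}(x)."""
    n = len(cols)
    poly = {(0,) * n: 1}
    for r in rows:
        h_r = {a: 1 for a in compositions(r, n)}  # every monomial has coefficient 1
        poly = multiply(poly, h_r)
    return poly.get(tuple(cols), 0)


def count_by_enumeration(rows, cols):
    """Enumerate all choices of rows and keep those with the right column sums."""
    n = len(cols)
    row_choices = [list(compositions(r, n)) for r in rows]
    return sum(
        1
        for table in product(*row_choices)
        if all(sum(row[j] for row in table) == cols[j] for j in range(n))
    )
```

For instance, with row sums $(2, 2)$ and column sums $(2, 2)$ both functions return 3, the coefficient of $x_1^2 x_2^2$ in $(x_1^2 + x_1 x_2 + x_2^2)^2$.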

Let us “approximate” every polynomial $h_r$ in the product (1.2.1) by the power
$(x_1 + \ldots + x_n)^r$. The monomial expansion of the power contains all the same
monomials $x^a$ of degree $r$, only the coefficient of the monomial
$x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ is equal not to 1 but to
$r!/\alpha_1! \cdots \alpha_n!$. Consequently, the coefficient of
$x_1^{c_1} \cdots x_n^{c_n}$ in the product

$$(x_1 + \ldots + x_n)^{r_1} \cdots (x_1 + \ldots + x_n)^{r_m} = (x_1 + \ldots + x_n)^N, \quad \text{where} \quad N = r_1 + \ldots + r_m,$$

is equal to $r_1! \cdots r_m!$ times the weighted number of contingency tables with the
row sums $r_1, \ldots, r_m$ and the column sums $c_1, \ldots, c_n$, where the table
$(d_{ij})$ is counted with the hypergeometric weight (1.1.2). On the other hand, this
coefficient is equal to $N!/c_1! \cdots c_n!$, from which we deduce (1.1.3).
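The identity (1.1.3) can be verified exactly on small marginals. The sketch below is an illustration under the assumption that tables are enumerated by brute force; the names `fisher_yates_total` and `multinomial_formula` are hypothetical. It sums the weights $\prod_{i,j} 1/d_{ij}!$ with exact rational arithmetic and compares the result with $N!/(r_1! \cdots r_m! \, c_1! \cdots c_n!)$.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def tables(rows, cols):
    """Yield every contingency table with the given row and column sums."""
    n = len(cols)
    for table in product(*(list(compositions(r, n)) for r in rows)):
        if all(sum(row[j] for row in table) == cols[j] for j in range(n)):
            yield table


def fisher_yates_total(rows, cols):
    """Sum of the weights prod 1/d_ij! over all tables."""
    return sum(
        Fraction(1, prod(factorial(d) for row in table for d in row))
        for table in tables(rows, cols)
    )


def multinomial_formula(rows, cols):
    """The closed form N! / (r_1! ... r_m! c_1! ... c_n!)."""
    denominator = prod(factorial(r) for r in rows) * prod(factorial(c) for c in cols)
    return Fraction(factorial(sum(rows)), denominator)
```

With row sums $(2, 2)$ and column sums $(2, 2)$ the three tables have weights $1/4$, $1/4$ and $1$, and $4!/(2!\,2!\,2!\,2!) = 3/2$ as well.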

As follows from formula (1.1.1), the Fisher-Yates statistics provides a reasonably
good approximation to the uniform measure on contingency tables if the row and
column sums are small. However, if only the row sums $r_i$ are small but column
sums $c_j$ are allowed to be large (for example, if $m \gg n$), the approximation (1.1.1)
is no longer valid.

In this paper, we present an algorithm for asymptotic computation of the number
of contingency tables where the row sums $r_i$ are small (and column sums $c_j$ are
allowed to be large). Namely, for any $\epsilon > 0$ and a positive integer $\rho$, fixed
in advance, we present an algorithm, which, given positive integers
$r_1, \ldots, r_m \le \rho$ and positive integers $c_1, \ldots, c_n$, approximates the
number of contingency tables with row sums $r_1, \ldots, r_m$ and column sums
$c_1, \ldots, c_n$ within a factor of $(1 - \epsilon)^N$, where
$N = r_1 + \ldots + r_m = c_1 + \ldots + c_n$. The algorithm has a quasi-polynomial
complexity of $m^{O(\log n)}$. We present the algorithm in Section 3. The algorithm
is based on the observation that $n$-variate complete symmetric polynomials $h_r$ for
small (fixed) $r$ can be approximated by polynomials of $O(\log n)$ rank. Namely, we
prove the following result.

(1.3) Theorem. Let us fix a positive integer $r$ and an $\epsilon > 0$. Then there exists
a constant $\kappa = \kappa(r, \epsilon) > 0$ with the following properties. For any
integer $n \ge 2$, there exist $k \le \kappa \ln n$ linear forms
$\ell_i : \mathbb{R}^n \to \mathbb{R}$ such that for the polynomial

$$\tilde{h}_r(x) = \sum_{i=1}^{k} \ell_i^r(x) = \sum_{\substack{\alpha_1, \ldots, \alpha_n \ge 0 \\ \alpha_1 + \ldots + \alpha_n = r}} \tilde{h}_{r,a} \, x^a,$$

we have

$$(1 - \epsilon)^r \le \tilde{h}_{r,a} \le (1 + \epsilon)^r$$

for all non-negative integer vectors $a = (\alpha_1, \ldots, \alpha_n)$ with
$\alpha_1 + \ldots + \alpha_n = r$.

Moreover, we present a polynomial time algorithm to construct the forms $\ell_i$. A
similar result holds for elementary symmetric functions $e_r$, which leads to a counting
algorithm for 0-1 matrices.

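(1.4) Theorem. Let us fix a positive integer $r$ and an $\epsilon > 0$. Then there exists
a constant $\kappa = \kappa(r, \epsilon) > 0$ with the following properties. For any
integer $n \ge 2$, there exist $k \le \kappa \ln n$ and linear forms
$\ell_{ij} : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \ldots, k$, $j = 1, \ldots, r$, such
that for the polynomial

$$\tilde{e}_r(x) = \sum_{i=1}^{k} \prod_{j=1}^{r} \ell_{ij}(x) = \sum_{\substack{I = \{i_1, \ldots, i_r\} \\ 1 \le i_1 < \ldots < i_r \le n}} \tilde{e}_{r,I} \, x_{i_1} \cdots x_{i_r},$$

we have

$$(1 - \epsilon)^r \le \tilde{e}_{r,I} \le (1 + \epsilon)^r$$

for all $r$-subsets $I \subset \{1, \ldots, n\}$.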

Let us fix a positive integer $k$. Let us “approximate” every polynomial $h_r$ in the
product (1.2.1) by a homogeneous polynomial of degree $r$ that is a product of
polynomials $h_s$ with $s \le k$. Then the coefficient of $x_1^{c_1} \cdots x_n^{c_n}$
in the resulting product enumerates contingency tables with weights “interpolating”
between the Fisher-Yates statistics for $k = 1$ and the uniform measure on tables for
$k \ge \max\{r_1, \ldots, r_m\}$.

As a by-product of our approach we express the number of contingency tables as
the expectation of the permanent of a random matrix. The permanent of an $N \times N$
matrix $A = (a_{ij})$ is expressed by the formula

$$\operatorname{per} A = \sum_{\sigma \in S_N} \prod_{i=1}^{N} a_{i\sigma(i)},$$

where $\sigma$ ranges over the symmetric group $S_N$ of all permutations of the set
$\{1, \ldots, N\}$. Recently, Jerrum, Sinclair, and Vigoda constructed a randomized
polynomial time approximation scheme to compute the permanent of a given non-negative
matrix [J+04]. As a corollary, they obtained a randomized polynomial time
approximation scheme to count 0-1 matrices with prescribed row and column sums.
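For small matrices the permanent can be computed exactly. The sketch below is an illustration only: it evaluates the defining sum over permutations and, as a cross-check, the classical Ryser inclusion-exclusion formula. Ryser's formula is a substitute chosen here for convenience; it is not the approximation scheme of [J+04] and is not used in this paper. Both routines take exponential time.

```python
from itertools import combinations, permutations
from math import prod


def permanent_by_definition(a):
    """Sum over all permutations sigma of prod_i a[i][sigma(i)]."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def permanent_ryser(a):
    """Ryser's formula: sum over non-empty column subsets S of
    (-1)^(n - |S|) * prod_i (sum of row i restricted to S)."""
    n = len(a)
    total = 0
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            row_sum_product = prod(sum(row[j] for j in cols) for row in a)
            total += (-1) ** (n - size) * row_sum_product
    return total
```

For the matrix with rows $(1, 2)$ and $(3, 4)$ both functions return $1 \cdot 4 + 2 \cdot 3 = 10$.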

Recall that a random variable $\gamma$ is standard exponential if

$$\mathbf{P}(\gamma > t) = \begin{cases} e^{-t} & \text{for } t \ge 0 \\ 1 & \text{for } t < 0. \end{cases}$$

We obtain the following result. 

(1.5) Theorem. Given positive integers $r_1, \ldots, r_m$ and $c_1, \ldots, c_n$ such
that $r_1 + \ldots + r_m = c_1 + \ldots + c_n = N$, let us consider the $N \times N$
random matrix $A$ constructed as follows. We represent the set of rows of $A$ as a
disjoint union of $m$ subsets $R_1, \ldots, R_m$, where $|R_i| = r_i$ for
$i = 1, \ldots, m$, and the set of columns of $A$ as a disjoint union of $n$ subsets
$C_1, \ldots, C_n$, where $|C_j| = c_j$ for $j = 1, \ldots, n$. Thus $A$ is split into
$mn$ blocks $R_i \times C_j$. We sample $mn$ independent standard exponential random
variables $\gamma_{ij}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, and fill the entries of
the block $R_i \times C_j$ by the copies of $\gamma_{ij}$. Let
$\alpha = \operatorname{per} A$, so $\alpha$ is a function of the random variables
$\gamma_{ij}$.

Then

(1) The number of $m \times n$ contingency tables with the row sums $r_1, \ldots, r_m$
and column sums $c_1, \ldots, c_n$ is equal to

$$\frac{\mathbf{E}\,\alpha}{r_1! \cdots r_m! \, c_1! \cdots c_n!}.$$

(2) We have

$$\frac{\mathbf{E}\,\alpha^2}{\mathbf{E}^2 \alpha} \le 2^{2N}.$$

(3) Suppose that $r_i, c_j \le \rho$ for some number $\rho$ and all $i, j$. Then there
exists a constant $\kappa = \kappa(\rho)$ such that

$$\frac{\mathbf{E}\,\alpha^2}{\mathbf{E}^2 \alpha} \le \kappa.$$

One can choose $\kappa(\rho) = \exp\{\rho^2 (2\rho)!\}$.

We prove Theorem 1.5 in Section 4. Let us fix a number $0 < p < 1$, for example
$p = 2/3$. As follows by the Chebyshev inequality, if the row and column sums
are bounded in advance, the average of $O(\epsilon^{-2})$ permanents of randomly
generated $N \times N$ matrices, divided by $r_1! \cdots r_m! \, c_1! \cdots c_n!$, with
probability at least $p$ approximates the number of contingency tables within a relative
error $\epsilon$. In view of [J+04], we obtain a polynomial time approximation algorithm
for counting contingency tables when the row and column sums are bounded by a constant,
fixed in advance.
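The estimator of Theorem 1.5 can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: the permanent is computed exactly by brute force (feasible only for very small $N$) instead of by the approximation scheme of [J+04], and the names `block_matrix` and `estimate_tables` are hypothetical.

```python
import random
from itertools import permutations
from math import factorial, prod


def permanent(a):
    """Brute-force permanent; exponential time, for tiny matrices only."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def block_matrix(rows, cols, gamma):
    """N x N matrix whose block R_i x C_j is filled with copies of gamma[i][j]."""
    return [
        [gamma[i][j] for j, c in enumerate(cols) for _copy in range(c)]
        for i, r in enumerate(rows)
        for _row in range(r)
    ]


def estimate_tables(rows, cols, samples, rng):
    """Average of per A over independent samples, divided by r_1!...r_m! c_1!...c_n!."""
    scale = prod(factorial(r) for r in rows) * prod(factorial(c) for c in cols)
    total = 0.0
    for _ in range(samples):
        gamma = [[rng.expovariate(1.0) for _ in cols] for _ in rows]
        total += permanent(block_matrix(rows, cols, gamma))
    return total / (samples * scale)
```

For row sums $(1, 1)$ and column sums $(1, 1)$ there are two tables, and `estimate_tables((1, 1), (1, 1), 20000, random.Random(0))` should return a value close to 2; the number of samples needed for a given accuracy is governed by Parts (2) and (3) of the theorem.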

(1.6) Counting with weights. A natural generalization of the counting problem
is counting with multiplicative weights. Given an $m \times n$ matrix $W = (w_{ij})$ of
weights, let us define the weight of an $m \times n$ non-negative integer matrix
$D = (d_{ij})$ as

$$\prod_{i,j} w_{ij}^{d_{ij}}.$$

For example, if $w_{ij} \in \{0, 1\}$ then the weight of $D$ is 1 if and only if
$d_{ij} > 0$ implies $w_{ij} = 1$. In this case, weighted counting amounts to counting
matrices whose non-zero entries are confined to the allowed positions $(i, j)$, for which
$w_{ij} = 1$. Our results for asymptotic counting of contingency tables with small row
sums admit generalization to counting with weights, provided the rank of the weight
matrix $W$ is fixed. Similarly, Theorem 1.5 admits a straightforward generalization for
counting with weights: the entries $\gamma_{ij}$ of the matrix $A$ need to be multiplied
by $w_{ij}$. Part (2) also remains valid, although Part (3) does not.
Finally, we note that the weighted modification of the Fisher-Yates statistics can
be easily expressed as a permanent.

(1.7) Theorem. Given positive integers $r_1, \ldots, r_m$ and $c_1, \ldots, c_n$ such
that $r_1 + \ldots + r_m = c_1 + \ldots + c_n = N$, and a non-negative $m \times n$ matrix
$W = (w_{ij})$, let us consider the $N \times N$ matrix $A$ constructed as follows. We
represent the set of rows of $A$ as a disjoint union of $m$ subsets $R_1, \ldots, R_m$,
where $|R_i| = r_i$ for $i = 1, \ldots, m$, and the set of columns of $A$ as a disjoint
union of $n$ subsets $C_1, \ldots, C_n$, where $|C_j| = c_j$ for $j = 1, \ldots, n$. Thus
$A$ is split into $mn$ blocks $R_i \times C_j$. Let us fill the entries of the block
$R_i \times C_j$ by $w_{ij}$. Then the total weight of $m \times n$ contingency tables
with the row sums $r_1, \ldots, r_m$ and column sums $c_1, \ldots, c_n$, where the table
$D = (d_{ij})$ is counted with the weight

$$\prod_{i,j} \frac{w_{ij}^{d_{ij}}}{d_{ij}!},$$

is equal to

$$\frac{\operatorname{per} A}{r_1! \cdots r_m! \, c_1! \cdots c_n!}.$$

We prove Theorem 1.7 in Section 4.
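Theorem 1.7 is a deterministic identity, so it can be tested exactly. The sketch below is an illustration with integer weights and brute-force enumeration; the names `weighted_total` and `permanent_formula` are hypothetical. It compares the weighted sum over tables with the permanent of the block matrix.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial, prod


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def permanent(a):
    """Brute-force permanent; exponential time, for tiny matrices only."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def block_matrix(rows, cols, w):
    """N x N matrix whose block R_i x C_j is filled with w[i][j]."""
    return [
        [w[i][j] for j, c in enumerate(cols) for _copy in range(c)]
        for i, r in enumerate(rows)
        for _row in range(r)
    ]


def weighted_total(rows, cols, w):
    """Sum over all tables D of prod_{i,j} w_ij^{d_ij} / d_ij!."""
    n = len(cols)
    total = Fraction(0)
    for table in product(*(list(compositions(r, n)) for r in rows)):
        if any(sum(row[j] for row in table) != cols[j] for j in range(n)):
            continue
        total += prod(
            Fraction(w[i][j] ** d, factorial(d))
            for i, row in enumerate(table)
            for j, d in enumerate(row)
        )
    return total


def permanent_formula(rows, cols, w):
    """per A / (r_1! ... r_m! c_1! ... c_n!) for the block matrix A."""
    scale = prod(factorial(r) for r in rows) * prod(factorial(c) for c in cols)
    return Fraction(permanent(block_matrix(rows, cols, w)), scale)
```

For row sums $(2, 1)$, column sums $(1, 2)$ and weights with rows $(1, 2)$ and $(3, 4)$, the two tables contribute $6$ and $8$, while $\operatorname{per} A = 56$ and $56/(2!\,1!\,1!\,2!) = 14$.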

2. Preliminaries: a scalar product in the space of polynomials 

We will use a certain scalar product in the space $V_n$ of real $n$-variate polynomials.
There are many ways to define it. The most straightforward way is to define the
scalar product of two monomials

$$\langle x^a, x^b \rangle = \begin{cases} \alpha_1! \cdots \alpha_n! & \text{if } a = b = (\alpha_1, \ldots, \alpha_n) \\ 0 & \text{otherwise.} \end{cases}$$

A more formal way is to write

$$\langle f, g \rangle = f(\partial) g \, \Big|_{x = (0, \ldots, 0)},$$

where $f(\partial)$ is the differential operator

$$f(\partial) = f\left( \frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n} \right).$$

The most invariant way is to consider the complex space $\mathbb{C}^n$, the Gaussian
measure $\nu_n$ there with the density

$$\frac{1}{\pi^n} e^{-\|z\|^2} \quad \text{where} \quad \|z\|^2 = |\zeta_1|^2 + \ldots + |\zeta_n|^2 \quad \text{for} \quad z = (\zeta_1, \ldots, \zeta_n),$$

and let

$$\langle f, g \rangle = \int_{\mathbb{C}^n} f(z) \overline{g(z)} \, d\nu_n.$$

From this representation or otherwise, cf. [Ba96], it follows that the scalar product
is invariant under orthogonal transformations of the coordinates: if $U$ is an
orthogonal transformation of $\mathbb{R}^n$ and $f_1$ and $g_1$ are defined by
$f_1(x) = f(Ux)$ and $g_1(x) = g(Ux)$ respectively, then
$\langle f, g \rangle = \langle f_1, g_1 \rangle$. Equivalently, for a linear
transformation $A : \mathbb{R}^n \to \mathbb{R}^n$, let us define the polynomial $Af$ by

$$Af(x) = f(A^* x) \quad \text{for} \quad x \in \mathbb{R}^n,$$

where $A^*$ is the conjugate transformation. Then

$$\langle Af, g \rangle = \langle f, A^* g \rangle.$$

The importance of this scalar product for us is that we can express the coefficient
of $x^a$ in $f$ as the scalar product

$$\frac{\langle f, x^a \rangle}{\alpha_1! \cdots \alpha_n!}.$$

(2.1) Complexity of computing the scalar product. Suppose that $f$ and $g$ are
$n$-variate homogeneous polynomials of degree $k$ given by their monomial expansions

$$f(x) = \sum_a f_a x^a \quad \text{and} \quad g(x) = \sum_a g_a x^a.$$

Then, to compute $\langle f, g \rangle$ one needs to sum up at most $\binom{n+k-1}{k}$
terms:

$$\langle f, g \rangle = \sum_{a = (\alpha_1, \ldots, \alpha_n)} \alpha_1! \cdots \alpha_n! \, f_a g_a.$$

Taking into account computation of factorials, one can compute the scalar product
using $O\left( k \binom{n+k-1}{k} \right)$ arithmetic operations. In particular, if the
number of variables $n$ is fixed, we get a polynomial time algorithm. We will also be
interested in the case of $n = O(\log k)$, in which case we get an algorithm of a
quasipolynomial $k^{O(\log k)}$ complexity.

Generally, if the polynomials $f$ and $g$ are defined by their “black boxes”, which,
for any given $x = (x_1, \ldots, x_n)$ compute the values $f(x)$ and $g(x)$, we can
obtain the monomial expansions of $f(x)$ and $g(x)$ via the standard procedure of
interpolation in time polynomial in $\binom{n+k-1}{k}$ (provided $n$ and $k$ are known
in advance), cf. [KY91] for the sparse version. Again, if $n$ is fixed, we get a
polynomial time algorithm and if $n = O(\log k)$, we get an algorithm of a
quasipolynomial complexity.
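The summation formula of Section 2.1 translates directly into code. The following Python sketch is an illustration only; a polynomial is stored as a dictionary from exponent vectors to coefficients, and the names `scalar_product`, `coefficient` and `number_of_monomials` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, prod


def scalar_product(f, g):
    """<f, g> = sum over a of a_1! ... a_n! f_a g_a for polynomials stored as
    {exponent tuple: coefficient}."""
    return sum(
        prod(factorial(e) for e in a) * fa * g[a]
        for a, fa in f.items()
        if a in g
    )


def coefficient(f, a):
    """Coefficient of x^a in f, recovered as <f, x^a> / (a_1! ... a_n!)."""
    return Fraction(scalar_product(f, {a: 1}), prod(factorial(e) for e in a))


def number_of_monomials(n, k):
    """Number of monomials of degree k in n variables, binom(n + k - 1, k)."""
    return comb(n + k - 1, k)
```

For example, with $f = 3x_1^2 + 5x_1x_2$ and $g = 7x_1^2 + x_2^2$ only the monomial $x_1^2$ is shared, so $\langle f, g \rangle = 2! \cdot 3 \cdot 7 = 42$.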

The invariance of the scalar product under the action of the orthogonal group 
often allows us to reduce the number of variables. 

(2.2) The rank of a polynomial. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a polynomial.
We say that $\operatorname{rank} f \le r$ if there are $r$ linear forms
$\ell_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \ldots, r$, and a polynomial
$q : \mathbb{R}^r \to \mathbb{R}$ such that

$$f(x) = q\left( \ell_1(x), \ldots, \ell_r(x) \right) \quad \text{for} \quad x = (x_1, \ldots, x_n).$$

Suppose we want to compute the scalar product $\langle f, g \rangle$, where
$\operatorname{rank} f \le r$ and $f$ is represented as a polynomial $q$ in linear forms
$\ell_1, \ldots, \ell_r$. Let $e_1, \ldots, e_r$ be the coordinate linear forms

$$e_i(x_1, \ldots, x_n) = x_i \quad \text{for} \quad i = 1, \ldots, r.$$

Let $A$ be a linear transformation such that $A e_i = \ell_i$ for $i = 1, \ldots, r$.
Then

$$\langle f, g \rangle = \langle q(A e_1, \ldots, A e_r), g \rangle = \langle A q(e_1, \ldots, e_r), g \rangle = \langle q(e_1, \ldots, e_r), A^* g \rangle.$$

Now we observe that $q(e_1, \ldots, e_r)$ is a polynomial in the first $r$ variables
$x_1, \ldots, x_r$. Replacing $A^* g$ by the “truncated” polynomial $\hat{g}$ obtained
from $A^* g$ by setting $x_{r+1} = \ldots = x_n = 0$, we reduce computation of
$\langle f, g \rangle$ to computation of the scalar product of two $r$-variate
polynomials

$$\langle f, g \rangle = \langle q, \hat{g} \rangle.$$

In practical terms, if the linear forms $\ell_i$ are defined by

$$\ell_i(x) = a_{i1} x_1 + \ldots + a_{in} x_n,$$

we fill the first $r$ rows of the $n \times n$ matrix $A^*$ with the coefficients
$a_{ij}$, $i \le r$, and fill the remaining rows arbitrarily. Then we transpose $A^*$ to
get $A$ and compute $\hat{g}(x_1, \ldots, x_r)$ by substituting
$x_{r+1} = \ldots = x_n = 0$ into $g(Ax)$, where $x$ is interpreted as the $n$-column of
variables $x_1, \ldots, x_n$.
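The reduction of Section 2.2 can be tried on a small example. The sketch below is an illustration under the assumption that polynomials are stored as dictionaries from exponent vectors to integer coefficients; the names `direct` and `reduced` are hypothetical. The function `direct` expands $f = q(\ell_1, \ldots, \ell_r)$ in all $n$ variables, while `reduced` substitutes $x_{r+1} = \ldots = x_n = 0$ into $g(Ax)$ and works with $r$ variables only.

```python
from collections import defaultdict
from math import factorial, prod


def poly_mul(f, g):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for a, fa in f.items():
        for b, gb in g.items():
            out[tuple(x + y for x, y in zip(a, b))] += fa * gb
    return dict(out)


def substitute(p, forms, n):
    """Plug the n-variate polynomials `forms` into the variables of p."""
    out = defaultdict(int)
    for a, pa in p.items():
        term = {(0,) * n: pa}
        for form, exponent in zip(forms, a):
            for _ in range(exponent):
                term = poly_mul(term, form)
        for b, tb in term.items():
            out[b] += tb
    return dict(out)


def linear_form(coeffs):
    """The linear form sum_j coeffs[j] x_j as a polynomial."""
    n = len(coeffs)
    return {tuple(int(i == j) for i in range(n)): c for j, c in enumerate(coeffs)}


def scalar_product(f, g):
    return sum(prod(factorial(e) for e in a) * fa * g[a] for a, fa in f.items() if a in g)


def direct(q, forms, g):
    """<q(l_1, ..., l_r), g> computed in all n variables."""
    n = len(forms[0])
    f = substitute(q, [linear_form(row) for row in forms], n)
    return scalar_product(f, g)


def reduced(q, forms, g):
    """The same scalar product computed as <q, g_hat> in r variables."""
    r, n = len(forms), len(forms[0])
    # After setting x_{r+1} = ... = x_n = 0, the j-th coordinate of Ax equals
    # sum_{i <= r} a_ij x_i, so the arbitrary rows of A* never enter.
    images = [linear_form([forms[i][j] for i in range(r)]) for j in range(n)]
    g_hat = substitute(g, images, r)
    return scalar_product(q, g_hat)
```

With $\ell_1 = x_1 + 2x_2 + 3x_3$, $\ell_2 = x_2 - x_3$, $q(y_1, y_2) = y_1 y_2 + y_2^2$ and $g = x_1 x_2 + x_3^2 + x_2 x_3$, both functions return $-4$.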

We will also need the following result, which can be considered as a complex
version of the Wick formula, see for example, [Zv97]. Since the author was unable
to locate it in the literature, we present its proof here.

(2.3) Lemma. Let $f_i, g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \ldots, m$, be linear
forms and let $F = f_1 \cdots f_m$ and $G = g_1 \cdots g_m$ be their products. Let us
define an $m \times m$ matrix $B = (b_{ij})$ by $b_{ij} = \langle f_i, g_j \rangle$ for
$i, j = 1, \ldots, m$. Then

$$\langle F, G \rangle = \operatorname{per} B.$$

Proof. First, we establish the formula in the particular case when
$g_1 = \ldots = g_m = e_1$, the 1st coordinate linear form. In this case $G = x_1^m$, so
letting $u = (1, 0, \ldots, 0)$, we can write

$$\langle F, G \rangle = m! \, F(u) = m! \, f_1(u) \cdots f_m(u).$$

On the other hand, $b_{ij} = f_i(u)$, so

$$\operatorname{per} B = m! \, f_1(u) \cdots f_m(u).$$

Next, we establish the formula when $g_1 = \ldots = g_m$. In this case, we can write
$g_i = A e_1$ for some linear transformation $A$ of $\mathbb{R}^n$. Hence

$$\langle F, G \rangle = \langle F, (A e_1)^m \rangle = \langle A^* F, e_1^m \rangle = \langle (A^* f_1) \cdots (A^* f_m), e_1^m \rangle.$$

Then $A^* f_i$ are linear forms and as we already established, the scalar product is
equal to the permanent of the matrix with the entries

$$\langle A^* f_i, e_1 \rangle = \langle f_i, A e_1 \rangle = \langle f_i, g_j \rangle = b_{ij}.$$

Finally, we establish the general case of the formula. Let us fix the forms
$f_1, \ldots, f_m$ and consider both $\langle F, G \rangle$ and $\operatorname{per} B$ as
functions of the forms $g_1, \ldots, g_m$. We observe that both $\langle F, G \rangle$
and $\operatorname{per} B$ are multilinear and symmetric in $g_1, \ldots, g_m$. Hence we
obtain the general case by polarization. Namely, let us fix $g_1, \ldots, g_m$. For real
variables $t = (\tau_1, \ldots, \tau_m)$, let us define the linear form
$g_t = \tau_1 g_1 + \ldots + \tau_m g_m$. Let $G_t = g_t^m$ and let $B(t)$ be defined by
$b_{ij}(t) = \langle f_i, g_t \rangle$. Then both $\langle F, G_t \rangle$ and
$\operatorname{per} B(t)$ are homogeneous polynomials of degree $m$ in
$\tau_1, \ldots, \tau_m$. Moreover, since both $\langle F, G \rangle$ and
$\operatorname{per} B$ are multilinear and symmetric in $g_1, \ldots, g_m$, the
coefficient of $\tau_1 \cdots \tau_m$ in $\langle F, G_t \rangle$ is equal to
$m! \, \langle F, G \rangle$ while the coefficient of $\tau_1 \cdots \tau_m$ in
$\operatorname{per} B(t)$ is equal to $m! \operatorname{per} B$. Since we already proved
that $\langle F, G_t \rangle = \operatorname{per} B(t)$, the result follows. □
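Lemma 2.3 lends itself to a quick numerical sanity check. The sketch below is an illustration with integer coefficients and is not a substitute for the proof; the names `product_of_forms` and `gram_matrix` are hypothetical. A linear form is given by its coefficient vector, and $\langle f_i, g_j \rangle$ is the dot product of the coefficient vectors because $\langle x_k, x_l \rangle$ equals 1 if $k = l$ and 0 otherwise.

```python
from collections import defaultdict
from itertools import permutations
from math import factorial, prod


def poly_mul(f, g):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for a, fa in f.items():
        for b, gb in g.items():
            out[tuple(x + y for x, y in zip(a, b))] += fa * gb
    return dict(out)


def product_of_forms(forms):
    """Expand the product of linear forms given by their coefficient vectors."""
    n = len(forms[0])
    out = {(0,) * n: 1}
    for coeffs in forms:
        form = {tuple(int(i == j) for i in range(n)): c for j, c in enumerate(coeffs)}
        out = poly_mul(out, form)
    return out


def scalar_product(f, g):
    return sum(prod(factorial(e) for e in a) * fa * g[a] for a, fa in f.items() if a in g)


def permanent(a):
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def gram_matrix(fs, gs):
    """Matrix B with entries b_ij = <f_i, g_j>, a dot product of coefficients."""
    return [[sum(x * y for x, y in zip(f, g)) for g in gs] for f in fs]
```

For $f_1 = x_1 + 2x_2$, $f_2 = 3x_1 + x_2$, $g_1 = x_1 + x_2$, $g_2 = x_1 - x_2$ we get $F = 3x_1^2 + 7x_1x_2 + 2x_2^2$ and $G = x_1^2 - x_2^2$, so $\langle F, G \rangle = 6 - 4 = 2$, while $B$ has rows $(3, -1)$ and $(4, 2)$ and $\operatorname{per} B = 6 - 4 = 2$.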
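3. Low rank approximations of symmetric polynomials


Let $\gamma$ be a random variable with the standard exponential distribution

$$\mathbf{P}(\gamma > \tau) = \begin{cases} e^{-\tau} & \text{for } \tau \ge 0 \\ 1 & \text{for } \tau < 0. \end{cases}$$

Hence for all integer $\alpha \ge 0$,

$$\mathbf{E}\,\gamma^{\alpha} = \alpha!.$$

We will use the following straightforward result.

(3.1) Lemma. Let $\gamma_1, \ldots, \gamma_n$ be independent random variables having the
standard exponential distribution. Then, for any integer $r > 0$,

$$\mathbf{E}\left( \gamma_1 x_1 + \ldots + \gamma_n x_n \right)^r = r! \, h_r(x_1, \ldots, x_n),$$

where $h_r$ is the complete symmetric polynomial of degree $r$.

Proof. We have

$$\left( \gamma_1 x_1 + \ldots + \gamma_n x_n \right)^r = \sum_{\substack{\alpha_1, \ldots, \alpha_n \ge 0 \\ \alpha_1 + \ldots + \alpha_n = r}} \frac{r!}{\alpha_1! \cdots \alpha_n!} \, \gamma_1^{\alpha_1} \cdots \gamma_n^{\alpha_n} \, x_1^{\alpha_1} \cdots x_n^{\alpha_n}.$$

Since the variables $\gamma_i$ are independent and
$\mathbf{E}\,\gamma_i^{\alpha_i} = \alpha_i!$, the proof follows.

□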
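Lemma 3.1 can be illustrated by simulation. The sketch below is a Monte Carlo check at a single point $x$ and not a proof; the names `complete_symmetric_value` and `monte_carlo_moment` are hypothetical, and the sample size is an arbitrary choice.

```python
import random


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def complete_symmetric_value(r, x):
    """h_r(x): the sum of all monomials of degree r evaluated at the point x."""
    total = 0
    for a in compositions(r, len(x)):
        term = 1
        for xi, e in zip(x, a):
            term *= xi ** e
        total += term
    return total


def monte_carlo_moment(r, x, samples, rng):
    """Sample average of (gamma_1 x_1 + ... + gamma_n x_n)^r for independent
    standard exponential gamma_i."""
    total = 0.0
    for _ in range(samples):
        total += sum(rng.expovariate(1.0) * xi for xi in x) ** r
    return total / samples
```

At $x = (1, 2)$ and $r = 2$ we have $h_2(x) = 1 + 2 + 4 = 7$, so the sample average returned by `monte_carlo_moment(2, (1, 2), 40000, random.Random(0))` should be close to $2! \cdot 7 = 14$.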


In what follows, $\kappa = \kappa(r, \epsilon)$ may denote various constants depending on
$r$ and $\epsilon$ only.

Given a “threshold” $\kappa > 0$, we define the truncated random exponential variable
$\hat{\gamma}$ by

$$\hat{\gamma} = \begin{cases} \gamma & \text{if } \gamma \le \kappa \\ 0 & \text{if } \gamma > \kappa, \end{cases}$$

where $\gamma$ is the standard exponential random variable. The following is
straightforward.

(3.2) Lemma. Given $r$ and $\delta > 0$, there exists a constant
$\kappa = \kappa(r, \delta)$ such that for the truncated random variable $\hat{\gamma}$,
one has

$$(1 - \delta) \, \alpha! \le \mathbf{E}\,\hat{\gamma}^{\alpha} \le \alpha! \quad \text{for} \quad \alpha = 0, \ldots, r.$$

□

Simple estimates show that one can choose

$$\kappa = O\left( r \ln r + \ln \frac{1}{\delta} \right).$$

Next, we are going to use a concentration inequality (Azuma’s inequality) for the
sum of independent bounded random variables, see, for example, Theorem A.16 of
[AS92].

(3.3) Proposition. Let $\xi_1, \ldots, \xi_m$ be independent random variables such that
$\mathbf{E}\,\xi_i = \beta$ for $i = 1, \ldots, m$ and $|\xi_i - \beta| \le \kappa$ for
some constant $\kappa$. Then, for all $\delta > 0$,

$$\mathbf{P}\left\{ \left| \frac{\xi_1 + \ldots + \xi_m}{m} - \beta \right| \ge \delta \right\} \le 2 e^{-m\delta^2 / 2\kappa^2}.$$

An important consequence of Proposition 3.3 is that for $\delta$, $\kappa$, and $r$
fixed, we can make the bound $2e^{-m\delta^2 / 2\kappa^2}$ less than $n^{-r}$ by choosing
$m = O(\log n)$.
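The constants in Lemma 3.2 and Proposition 3.3 can be made concrete. The sketch below is an illustration and not part of the paper; it uses the elementary closed form $\mathbf{E}\,\hat{\gamma}^a = \int_0^{\kappa} t^a e^{-t}\,dt = a!\left(1 - e^{-\kappa}\sum_{j=0}^{a} \kappa^j/j!\right)$ for $a \ge 1$, searches integer thresholds only, and the names `truncated_moment`, `threshold` and `sample_size` are hypothetical.

```python
from math import exp, factorial, floor, log


def truncated_moment(a, kappa):
    """E of the a-th power of the exponential variable truncated at kappa."""
    if a == 0:
        return 1.0  # the zeroth power is identically 1
    tail = exp(-kappa) * sum(kappa ** j / factorial(j) for j in range(a + 1))
    return factorial(a) * (1.0 - tail)


def threshold(r, delta):
    """Smallest positive integer kappa with (1 - delta) a! <= E gamma_hat^a
    for all a = 0, ..., r."""
    kappa = 1
    while any(
        truncated_moment(a, kappa) < (1 - delta) * factorial(a) for a in range(r + 1)
    ):
        kappa += 1
    return kappa


def sample_size(n, r, delta, kappa):
    """Smallest integer m with 2 exp(-m delta^2 / (2 kappa^2)) < n^(-r)."""
    return floor(2 * kappa ** 2 * (r * log(n) + log(2)) / delta ** 2) + 1
```

The value returned by `sample_size` grows like $\ln n$ for fixed $r$, $\delta$ and $\kappa$, in line with the remark after Proposition 3.3, although the hidden constant $2\kappa^2 r/\delta^2$ is large.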

Now we can prove Theorem 1.3.

Proof of Theorem 1.3. We choose a $\delta > 0$ so that
$(1 - \delta) > (1 - \epsilon)^{1/2}$ and a threshold $\kappa = \kappa(r, \delta)$ so as
to satisfy the conditions of Lemma 3.2. Then we sample the coefficients of the linear
forms $\ell_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \ldots, m$, independently at
random from the truncated standard exponential distribution. Let

$$\tilde{h}_r(x) = \frac{1}{m \, r!} \sum_{i=1}^{m} \ell_i^r(x).$$

Then, each coefficient $\tilde{h}_{r,a}$ of the monomial $x^a$ is the average of $m$
independent random samples of the random variable

$$\xi_a = \frac{\hat{\gamma}_1^{\alpha_1} \cdots \hat{\gamma}_n^{\alpha_n}}{\alpha_1! \cdots \alpha_n!}.$$

Since $r$ is fixed, all random variables $\xi_a$ remain uniformly bounded by some
constant depending on $r$ and $\epsilon$ only. Moreover,
$(1 - \epsilon)^{r/2} \le \mathbf{E}\,\xi_a \le 1$. Since for a fixed $r$, the number
$\binom{n+r-1}{r}$ of monomials $x^a$ of degree $r$ is bounded by a polynomial in $n$,
by Proposition 3.3 we can choose $m = O(\log n)$ so that for each $a$, the probability
that the average of $\xi_a$ does not lie within the interval
$[(1 - \epsilon)^r, (1 + \epsilon)^r]$ does not exceed
$\left( 3 \binom{n+r-1}{r} \right)^{-1}$. Then, with probability at least $2/3$, the
average $\tilde{h}_r$ satisfies the conditions of Theorem 1.3. □
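The random construction in the proof can be simulated directly. The sketch below is an illustration only: it computes every coefficient of $\frac{1}{m\,r!}\sum_i \ell_i^r(x)$ explicitly, which is feasible for small $n$ and $r$, and the names `truncated_exponential` and `low_rank_coefficients` are hypothetical. The sample size used in the usage note is far larger than a logarithmic bound would suggest for such a small $n$, since the constant hidden in $m = O(\log n)$ is not tracked here.

```python
import random
from math import comb, factorial, prod


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def truncated_exponential(rng, kappa):
    """Standard exponential sample, replaced by 0 when it exceeds kappa."""
    g = rng.expovariate(1.0)
    return g if g <= kappa else 0.0


def low_rank_coefficients(n, r, m, kappa, rng):
    """Coefficients of (1 / (m r!)) * sum_i l_i(x)^r, where the m linear forms
    l_i have independent truncated exponential coefficients."""
    forms = [[truncated_exponential(rng, kappa) for _ in range(n)] for _ in range(m)]
    coeffs = {}
    for a in compositions(r, n):
        denom = prod(factorial(e) for e in a)
        # The coefficient of x^a in l^r / r! is prod_j gamma_j^{a_j} / a_j!.
        total = sum(prod(g ** e for g, e in zip(form, a)) for form in forms)
        coeffs[a] = total / (m * denom)
    return coeffs
```

For example, `low_rank_coefficients(3, 2, 20000, 12.0, random.Random(1))` returns $\binom{4}{2} = 6$ coefficients, each of which should be close to 1, the corresponding coefficient of $h_2$.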


We sketch the proof of Theorem 1.4 below.

Sketch of proof of Theorem 1.4. With a surjective map
$\omega : \{1, \ldots, n\} \to \{1, \ldots, r\}$ we associate a homogeneous polynomial
$p_\omega$ of degree $r$ in $n$ variables $x_1, \ldots, x_n$, which is the product of $r$
linear forms in $x = (x_1, \ldots, x_n)$:

$$p_\omega(x) = \prod_{j=1}^{r} \sum_{\substack{i \in \{1, \ldots, n\} \\ \omega(i) = j}} x_i.$$

If $\omega$ is sampled from the uniform distribution on the space of all surjective maps
$\{1, \ldots, n\} \to \{1, \ldots, r\}$ then the expectation $\mathbf{E}\,p_\omega$ is a
positive multiple of the elementary symmetric polynomial $e_r(x)$. Now we approximate
$\mathbf{E}\,p_\omega$ by a sample average of $O(\log n)$ polynomials $p_\omega$. To
sample $\omega$, it suffices to sample $\omega(i)$ independently for $i = 1, \ldots, n$
and accept the resulting map if it is surjective. The map fails to be surjective with
probability at most $r(1 - 1/r)^n$, which is negligible if $r$ is fixed and $n$ grows. □
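The claim that $\mathbf{E}\,p_\omega$ is a positive multiple of $e_r$ can be confirmed exhaustively for small $n$ and $r$. The sketch below is an illustration only; it sums $p_\omega$ over all surjective maps instead of sampling, and the names `p_omega` and `sum_over_surjections` are hypothetical. A square-free monomial is stored as the sorted tuple of its variable indices (0-based).

```python
from collections import defaultdict
from itertools import combinations, product


def p_omega(omega, n, r):
    """Expand prod_{j} sum_{i: omega(i) = j} x_i as {index tuple: coefficient}."""
    classes = [[i for i in range(n) if omega[i] == j] for j in range(r)]
    poly = defaultdict(int)
    # The classes are disjoint, so every choice gives r distinct variables.
    for choice in product(*classes):
        poly[tuple(sorted(choice))] += 1
    return poly


def sum_over_surjections(n, r):
    """Sum of p_omega over all surjective maps {0..n-1} -> {0..r-1}."""
    total = defaultdict(int)
    for omega in product(range(r), repeat=n):
        if len(set(omega)) == r:  # keep surjective maps only
            for monomial, c in p_omega(omega, n, r).items():
                total[monomial] += c
    return dict(total)
```

For $n = 4$ and $r = 2$ every one of the six square-free monomials $x_i x_j$ receives the coefficient $r! \, r^{n-r} = 8$: the two chosen variables must be sent bijectively onto $\{1, 2\}$ and the remaining variables can be mapped arbitrarily.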
(3.4) Derandomization. Proofs of Theorems 1.3 and 1.4 allow us to construct
polynomials $\tilde{h}_r$ and $\tilde{e}_r$ by averaging $O(\log n)$ polynomials that are
built from linear functions with independent random coefficients. A closer look reveals
that the coefficients do not have to be independent, but only $r$-wise independent (that
is, every $r$ coefficients should be independent). If $r$ is fixed in advance, one can
use constructions of small (polynomial size) sample spaces to simulate such random
variables, cf. Section 2 of Chapter 15 of [AS92] and [E+98]. This leads to polynomial
time deterministic algorithms for construction of polynomials $\tilde{h}_r$ and
$\tilde{e}_r$ in Theorems 1.3 and 1.4.

(3.5) Asymptotic counting of contingency tables. Now we can come up with
an algorithm for asymptotic counting of tables. Let us fix an $\epsilon > 0$ and a
positive integer $\rho$. Suppose that $r_1, \ldots, r_m \le \rho$. We construct
polynomials $\tilde{h}_{r_i}$ as in Theorem 1.3. The coefficient of
$x_1^{c_1} \cdots x_n^{c_n}$ in the product

$$\tilde{H}(x) = \tilde{h}_{r_1}(x) \cdots \tilde{h}_{r_m}(x)$$

up to a factor of $(1 - \epsilon)^N$ is equal to the number of contingency tables with
the row sums $r_1, \ldots, r_m$ and the column sums $c_1, \ldots, c_n$. Theorem 1.3
implies that the rank of $\tilde{H}$ is $O(\log n)$. Hence, applying the algorithm of
Section 2.2, we compute the required coefficient in $m^{O(\log n)}$ time.

This construction allows some extensions and ramifications.

First, it extends to counting with weights (cf. Section 1.6) provided the rank
of the weight matrix $W = (w_{ij})$ is fixed in advance. To this end, we approximate
the polynomial $h_{r_i}(w_{i1} x_1, \ldots, w_{in} x_n)$ by the sample average of
$O(\log n)$ powers $\ell_i^{r_i}(x)$ of linear forms
$\ell_i(x) = \sum_{j=1}^{n} \gamma_j w_{ij} x_j$, where $\gamma_j$ are independent
exponential random variables. If $\operatorname{rank} W$ is fixed in advance, the forms
used in the approximation of $h_{r_i}$ span a subspace of $O(\log n)$ dimension.

Second, we can compute approximately various other expressions of the type
$\langle Q(x), \tilde{H}(x) \rangle$. For example, let $C_k \subset \{1, \ldots, N\}$ for
$k = 1, \ldots, n$ be subsets of integers and let

$$Q(x) = \prod_{k=1}^{n} \sum_{c \in C_k} \frac{x_k^c}{c!}.$$

Then, up to a factor of $(1 - \epsilon)^N$, the value of
$\langle Q(x), \tilde{H}(x) \rangle$ is equal to the number of contingency tables with
the row sums $r_1, \ldots, r_m$ and all possible column sums $c_1, \ldots, c_n$ such that
$c_k \in C_k$ for $k = 1, \ldots, n$.

Finally, using Theorem 1.4 instead of Theorem 1.3 we obtain asymptotic enumeration
algorithms for 0-1 matrices.

4. The estimator for the number of tables 

In this section, we prove Theorems 1.5 and 1.7.

Proof of Theorem 1.5. Let us define an $n$-variate polynomial

$$H(x) = \prod_{i=1}^{m} h_{r_i}(x) \quad \text{for} \quad x = (x_1, \ldots, x_n),$$

where $h_r(x)$ is the complete symmetric polynomial of degree $r$. Then the number
of contingency tables with the row sums $r_1, \ldots, r_m$ and column sums
$c_1, \ldots, c_n$ is equal to the coefficient of $x^c = x_1^{c_1} \cdots x_n^{c_n}$ in
$H(x)$. Using the scalar product of Section 2, we can write this number as

$$\frac{\langle x^c, H(x) \rangle}{c_1! \cdots c_n!}.$$

Using Lemma 3.1, we express $H(x)$ as the expectation of a product of linear forms.
Namely, we define random linear forms $\ell_i$ by

$$\ell_i(x) = \sum_{j=1}^{n} \gamma_{ij} x_j,$$

where $\gamma_{ij}$ are independent exponential random variables. Then, by Lemma 3.1,

$$H(x) = \prod_{i=1}^{m} \frac{1}{r_i!} \, \mathbf{E}\,\ell_i^{r_i}(x) = \frac{1}{r_1! \cdots r_m!} \, \mathbf{E} \prod_{i=1}^{m} \ell_i^{r_i}(x).$$

Let us denote $L(x) = \prod_{i=1}^{m} \ell_i^{r_i}(x)$. Hence the number of contingency
tables can be written as

$$\frac{\mathbf{E}\,\langle L(x), x^c \rangle}{c_1! \cdots c_n! \, r_1! \cdots r_m!}.$$

Since both $L(x)$ and $x^c$ are products of linear forms, by Lemma 2.3 their scalar
product is equal to the permanent of the matrix of pairwise scalar products of the
linear forms $\ell_i(x)$ and $e_j(x) = x_j$, which is the matrix $A$. This proves
Part (1) of the theorem.

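Let $S_N$ be the symmetric group of all permutations of the set $\{1, \ldots, N\}$.
Denoting the entries of $A$ by $a_{ij}$, we get

$$\alpha = \sum_{\sigma \in S_N} \prod_{i=1}^{N} a_{i\sigma(i)} \quad \text{and} \quad \alpha^2 = \sum_{\phi, \psi \in S_N} \prod_{i=1}^{N} a_{i\phi(i)} a_{i\psi(i)}.$$

Therefore,

$$\mathbf{E}^2 \alpha = \sum_{\phi, \psi \in S_N} \left( \mathbf{E} \prod_{i=1}^{N} a_{i\phi(i)} \right) \left( \mathbf{E} \prod_{i=1}^{N} a_{i\psi(i)} \right)$$

and

$$\mathbf{E}\,\alpha^2 = \sum_{\phi, \psi \in S_N} \mathbf{E} \left( \prod_{i=1}^{N} a_{i\phi(i)} a_{i\psi(i)} \right).$$

Hence we represented $\mathbf{E}^2 \alpha$ and $\mathbf{E}\,\alpha^2$ as sums of
$(N!)^2$ terms parameterized by pairs of permutations $(\phi, \psi)$.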

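To prove Part (2), we show that every term in the expansion of $\mathbf{E}\,\alpha^2$ is
at most $2^{2N}$ times the corresponding term in the expansion of $\mathbf{E}^2 \alpha$.
Indeed, each term in the expansion of $\mathbf{E}^2 \alpha$ is the product of the type

$$\left( \mathbf{E} \prod_{i,j} \gamma_{ij}^{u_{ij}} \right) \left( \mathbf{E} \prod_{i,j} \gamma_{ij}^{v_{ij}} \right) = \prod_{i,j} u_{ij}! \, v_{ij}!,$$

where $u_{ij}$ and $v_{ij}$ are non-negative integers such that

$$\sum_{i,j} u_{ij} = \sum_{i,j} v_{ij} = N.$$

The corresponding term in the expansion of $\mathbf{E}\,\alpha^2$ is

$$\mathbf{E} \prod_{i,j} \gamma_{ij}^{u_{ij} + v_{ij}} = \prod_{i,j} (u_{ij} + v_{ij})!.$$

Hence the ratio is

$$\prod_{i,j} \frac{(u_{ij} + v_{ij})!}{u_{ij}! \, v_{ij}!} \le \prod_{i,j} 2^{u_{ij} + v_{ij}} = 2^{2N},$$

which proves Part (2).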

To prove Part (3), we notice that $\mathbf{E}\,\alpha \ge N!$, since the expectation of
every term is at least 1. Let us consider a particular term

$$t_{\phi\psi} = \prod_{i=1}^{N} a_{i\phi(i)} a_{i\psi(i)}.$$

We have $\mathbf{E}\,t_{\phi\psi} > 1$ if and only if some of the entries $a_{ij}$ in the
product belong to the same block $R_i \times C_j$. On the other hand, the maximum number
of entries $a_{ij}$ which belong to the same block does not exceed $2\rho$. Therefore,
if the number of blocks with more than one entry is $k$,

$$\mathbf{E}\,t_{\phi\psi} \le \left( (2\rho)! \right)^k.$$

Let us bound the number of terms $t_{\phi\psi}$ with $k$ entries belonging to the same
block. We can choose a permutation $\phi \in S_N$ in $N!$ ways and a subset
$I \subset \{1, \ldots, N\}$ of $k$ indices in $\binom{N}{k}$ ways. For each entry
$a_{i\phi(i)}$ with $i \in I$ we identify the block where $a_{i\phi(i)}$ belongs. Hence
we get $k$ or fewer blocks since some of them may coincide. Now, for each $i \in I$ there
are at most $\rho$ choices of $j \in \{1, \ldots, N\}$ and at most $\rho$ choices of
$\psi(j)$ such that $a_{j\psi(j)}$ and $a_{i\phi(i)}$ belong to the same block. After
that, there are $(N - k)!$ choices for $\psi(i)$ for $i \notin I$. Since
$N! \binom{N}{k} (N - k)! = (N!)^2 / k!$, we get

$$\mathbf{E}\,\alpha^2 \le (N!)^2 \sum_{k=0}^{N} \frac{\rho^{2k}}{k!} \left( (2\rho)! \right)^k \le (N!)^2 \exp\{\rho^2 (2\rho)!\},$$

from which the proof of Part (3) follows. □

The bound in Part (3) is probably non-optimal. 

Proof of Theorem 1.7. Let us define linear forms $\ell_i$ by

$$\ell_i(x) = \sum_{j=1}^{n} w_{ij} x_j \quad \text{for} \quad i = 1, \ldots, m.$$

Let

$$L(x) = \prod_{i=1}^{m} \ell_i^{r_i}(x).$$

As follows from the discussion of Section 1.2, the number of weighted tables can
be expressed as the coefficient of $x^c = x_1^{c_1} \cdots x_n^{c_n}$ in the product
$L(x)$ divided by $r_1! \cdots r_m!$. Using the scalar product of Section 2, we write the
number of weighted tables as

$$\frac{\langle L(x), x^c \rangle}{c_1! \cdots c_n! \, r_1! \cdots r_m!}.$$

Since both $L(x)$ and $x^c$ are products of linear forms, by Lemma 2.3 their scalar
product is equal to the permanent of the matrix of pairwise scalar products of the
linear forms $\ell_i(x)$ and $e_j(x) = x_j$, which is the matrix $A$. □